28 research outputs found

    Detection-aided medical image segmentation using deep learning

    The details of the work will be defined once the student reaches the destination institution. A fully automatic technique for segmenting the liver and localizing its unhealthy tissues is a convenient tool for diagnosing hepatic diseases and assessing the response to the corresponding treatments. In this thesis we propose a method to segment the liver and its lesions from Computed Tomography (CT) scans, as well as other anatomical structures and organs of the human body. We use Convolutional Neural Networks (CNNs), which have shown good results in a variety of tasks, including medical imaging. The network that segments the lesions is a cascaded architecture, which first focuses on the liver region in order to segment the lesion. Moreover, we train a detector to localize the lesions and keep only those pixels from the output of the segmentation network where a lesion is detected. The segmentation architecture is based on DRIU (Maninis, 2016), a Fully Convolutional Network (FCN) with side outputs that work at feature maps of different resolutions, so that the final result benefits from the multi-scale information learned by different stages of the network. Our pipeline is 2.5D, as the input of the network is a stack of consecutive slices of the CT scans. We also study different methods to benefit from the liver segmentation in order to delineate the lesion. The main focus of this work is to use the detector to localize the lesions, as we demonstrate that it helps to remove false positives triggered by the segmentation network. The benefit of using a detector on top of the segmentation is that the detector acquires a more global insight into the healthiness of liver tissue than the segmentation network, whose final output is pixel-wise and is not forced to take a global decision over a whole liver patch. We show experiments with the LiTS dataset for lesion and liver segmentation. To demonstrate the generality of the segmentation network, we also segment several anatomical structures from the Visceral dataset.
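The detector-based filtering described in this abstract (keeping only the segmented pixels that fall where a lesion is detected) can be illustrated with a minimal sketch. This is not the thesis code: the function name `filter_by_detections`, the nested-list mask, and the box format `(row_start, col_start, row_end, col_end)` with exclusive ends are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (row_start, col_start, row_end, col_end), ends exclusive.
Box = Tuple[int, int, int, int]


def filter_by_detections(seg_mask: List[List[int]], boxes: List[Box]) -> List[List[int]]:
    """Keep segmented pixels only where at least one detection box covers them."""
    height = len(seg_mask)
    width = len(seg_mask[0]) if height else 0

    # Build a binary map of every pixel covered by some detection box.
    keep = [[0] * width for _ in range(height)]
    for r0, c0, r1, c1 in boxes:
        for r in range(max(r0, 0), min(r1, height)):
            for c in range(max(c0, 0), min(c1, width)):
                keep[r][c] = 1

    # Pixel-wise AND of the segmentation output and the detection map.
    return [[seg_mask[r][c] & keep[r][c] for c in range(width)] for r in range(height)]


# Toy example: the blob on the right has no supporting detection and is removed.
seg = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
detections = [(0, 0, 2, 2)]
filtered = filter_by_detections(seg, detections)
```

In this toy case the unsupported pixels in the last column play the role of the false positives that the abstract says the detector helps remove.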

    Hierarchical object detection with deep reinforcement learning

    We present a method for performing hierarchical object detection in images guided by a deep reinforcement learning agent. The key idea is to focus on those parts of the image that contain richer information and zoom in on them. We train an intelligent agent that, given an image window, is capable of deciding where to focus its attention among five different predefined region candidates (smaller windows). This procedure is iterated, providing a hierarchical image analysis. We compare two different candidate proposal strategies to guide the object search: with and without overlap. Moreover, our work compares two different strategies to extract features from a convolutional neural network for each region proposal: the first computes new feature maps for each region proposal, and the second computes the feature maps for the whole image and later generates crops for each region proposal. Experiments indicate better results for the overlapping candidate proposal strategy and a loss of performance for the cropped image features due to the loss of spatial resolution. We argue that, while this loss seems unavoidable when working with large numbers of object candidates, the much smaller number of region proposals generated by our reinforcement learning agent makes it feasible to extract features for each location without sharing convolutional computation among regions.
    Postprint (published version)
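The iterated zoom over five predefined sub-windows can be sketched as below. The abstract does not specify the layout of the candidates, so the arrangement used here (four corners plus a centered window) and the `scale` parameter are assumptions; the names `five_candidates` and `zoom` are hypothetical, and a fixed list of indices stands in for the decisions of the learned agent.

```python
from typing import List, Sequence, Tuple

Window = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def five_candidates(window: Window, scale: float = 0.5) -> List[Window]:
    """Return five sub-windows: four corners and one centered (assumed layout).

    With scale=0.5 the corner windows do not overlap each other;
    a larger scale such as 0.75 makes them overlap.
    """
    x0, y0, x1, y1 = window
    w, h = (x1 - x0) * scale, (y1 - y0) * scale
    corners = [
        (x0, y0, x0 + w, y0 + h),      # top-left
        (x1 - w, y0, x1, y0 + h),      # top-right
        (x0, y1 - h, x0 + w, y1),      # bottom-left
        (x1 - w, y1 - h, x1, y1),      # bottom-right
    ]
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    center = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    return corners + [center]


def zoom(window: Window, choices: Sequence[int], scale: float = 0.5) -> Window:
    """Follow a sequence of candidate indices, one per hierarchy level."""
    for idx in choices:
        window = five_candidates(window, scale)[idx]
    return window


root = (0.0, 0.0, 100.0, 100.0)
final_window = zoom(root, [4, 0])  # pick the center, then its top-left corner
```

Because each step visits only five candidates, a search of depth d evaluates at most 5·d windows, which is the small proposal count the abstract relies on when arguing for per-region feature extraction.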

    RVOS: end-to-end recurrent network for video object segmentation

    Multiple object video object segmentation is a challenging task, especially for the zero-shot case, when no object mask is given at the initial frame and the model has to find the objects to be segmented along the sequence. In our work, we propose a Recurrent network for multiple object Video Object Segmentation (RVOS) that is fully end-to-end trainable. Our model incorporates recurrence on two different domains: (i) the spatial, which allows discovering the different object instances within a frame, and (ii) the temporal, which allows keeping the coherence of the segmented objects over time. We train RVOS for zero-shot video object segmentation and are the first to report quantitative results for the DAVIS-2017 and YouTube-VOS benchmarks. Further, we adapt RVOS for one-shot video object segmentation by using the masks obtained in previous time steps as inputs to be processed by the recurrent module. Our model reaches results comparable to state-of-the-art techniques on the YouTube-VOS benchmark and outperforms all previous video object segmentation methods not using online learning on the DAVIS-2017 benchmark. Moreover, our model achieves faster inference runtimes than previous methods, reaching 44 ms/frame on a P100 GPU.
    Peer Reviewed
    Postprint (published version)

    Budget-aware semi-supervised semantic and instance segmentation

    Methods that move towards less supervised scenarios are key for image segmentation, as dense labels demand significant human intervention. Generally, the annotation burden is mitigated by labeling datasets with weaker forms of supervision, e.g. image-level labels or bounding boxes. Another option is semi-supervised settings, which commonly leverage a few strong annotations and a large amount of unlabeled or weakly-labeled data. In this paper, we revisit semi-supervised segmentation schemes and significantly narrow down the annotation budget (in terms of total labeling time of the training set) compared to previous approaches. With a very simple pipeline, we demonstrate that at low annotation budgets, semi-supervised methods outperform weakly-supervised ones by a wide margin for both semantic and instance segmentation. Our approach also outperforms previous semi-supervised works at a much reduced labeling cost. We present results for the Pascal VOC benchmark and unify weakly and semi-supervised approaches by considering the total annotation budget, thus allowing a fairer comparison between methods.
    Peer Reviewed
    Postprint (author's final draft)

    Distributed training strategies for a computer vision deep learning algorithm on a distributed GPU cluster

    Deep learning algorithms base their success on building high learning capacity models with millions of parameters that are tuned in a data-driven fashion. These models are trained by processing millions of examples, so the development of more accurate algorithms is usually limited by the throughput of the computing devices on which they are trained. In this work, we explore how the training of a state-of-the-art neural network for computer vision can be parallelized on a distributed GPU cluster. The effect of distributing the training process is addressed from two different points of view. First, the scalability of the task and its performance in the distributed setting are analyzed. Second, the impact of distributed training methods on the final accuracy of the models is studied.
    This work is partially supported by the Spanish Ministry of Economy and Competitiveness under contract TIN2012-34557, by the BSC-CNS Severo Ochoa program (SEV-2011-00067), by the SGR programmes (2014-SGR-1051 and 2014-SGR-1421) of the Catalan Government and by the framework of the project BigGraph TEC2013-43935-R, funded by the Spanish Ministerio de Economia y Competitividad and the European Regional Development Fund (ERDF). We would also like to thank the technical support team at the Barcelona Supercomputing Center (BSC), especially Carlos Tripiana.
    Peer Reviewed
    Postprint (published version)

    A closer look at referring expressions for video object segmentation

    The task of Language-guided Video Object Segmentation (LVOS) aims at generating binary masks for an object referred to by a linguistic expression. When this expression unambiguously describes an object in the scene, it is called a referring expression (RE). Our work argues that existing benchmarks used for LVOS are mainly composed of trivial cases, in which referents can be identified with simple phrases. Our analysis relies on a new categorization of the referring expressions in the DAVIS-2017 and Actor-Action datasets into trivial and non-trivial REs, where the non-trivial REs are further annotated with seven RE semantic categories. We leverage these data to analyze the performance of RefVOS, a novel neural network that obtains competitive results for the task of language-guided image segmentation and state-of-the-art results for LVOS. Our study indicates that the major challenges for the task are related to understanding motion and static actions.
    Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work was partially supported by the projects PID2019-107255GB-C22 and PID2020-117142GB-I00 funded by MCIN/AEI/10.13039/501100011033 Spanish Ministry of Science, and the grant 2017-SGR-1414 of the Government of Catalonia. This work was also partially supported by the project RTI2018-095232-B-C22 funded by the Spanish Ministry of Science, Innovation and Universities.
    Peer Reviewed
    Postprint (published version)

    RVOS: end-to-end recurrent network for video object segmentation

    Multiple object video object segmentation is a challenging task, especially for the zero-shot case, when no object mask is given at the initial frame and the model has to find the objects to be segmented along the sequence. In our work, we propose a Recurrent network for multiple object Video Object Segmentation (RVOS) that is fully end-to-end trainable. Our model incorporates recurrence on two different domains: (i) the spatial, which allows discovering the different object instances within a frame, and (ii) the temporal, which allows keeping the coherence of the segmented objects over time. We train RVOS for zero-shot video object segmentation and are the first to report quantitative results for the DAVIS-2017 and YouTube-VOS benchmarks. Further, we adapt RVOS for one-shot video object segmentation by using the masks obtained in previous time steps as inputs to be processed by the recurrent module. Our model reaches results comparable to state-of-the-art techniques on the YouTube-VOS benchmark and outperforms all previous video object segmentation methods not using online learning on the DAVIS-2017 benchmark. Moreover, our model achieves faster inference runtimes than previous methods, reaching 44 ms/frame on a P100 GPU.
    This research was supported by the Spanish Ministry of Economy and Competitiveness and the European Regional Development Fund (TIN2015-66951-C2-2-R, TIN2015-65316-P & TEC2016-75976-R), the BSC-CNS Severo Ochoa SEV-2015-0493 and LaCaixa-Severo Ochoa International Doctoral Fellowship programs, the 2017 SGR 1414 and the Industrial Doctorates 2017-DI-064 & 2017-DI-028 from the Government of Catalonia.
    Peer Reviewed
    Postprint (published version)

    The Liver Tumor Segmentation Benchmark (LiTS)

    In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset is diverse and contains primary and secondary tumors with varied sizes and appearances and various lesion-to-background levels (hyper-/hypo-dense), created in collaboration with seven hospitals and research institutions. Seventy-five submitted liver and liver tumor segmentation algorithms were trained on a set of 131 computed tomography (CT) volumes and were tested on 70 unseen test images acquired from different patients. We found that no single algorithm performed best for both liver and liver tumors in the three events. The best liver segmentation algorithm achieved a Dice score of 0.963, whereas, for tumor segmentation, the best algorithms achieved Dice scores of 0.674 (ISBI 2017), 0.702 (MICCAI 2017), and 0.739 (MICCAI 2018). Retrospectively, we performed additional analysis on liver tumor detection and revealed that not all top-performing segmentation algorithms worked well for tumor detection. The best liver tumor detection method achieved a lesion-wise recall of 0.458 (ISBI 2017), 0.515 (MICCAI 2017), and 0.554 (MICCAI 2018), indicating the need for further research. LiTS remains an active benchmark and resource for research, e.g., contributing the liver-related segmentation tasks in http://medicaldecathlon.com/. In addition, both data and online evaluation are accessible via https://competitions.codalab.org/competitions/17094.
    Bjoern Menze is supported through the DFG funding (SFB 824, subproject B12) and a Helmut-Horten-Professorship for Biomedical Informatics by the Helmut-Horten-Foundation. Florian Kofler is supported by Deutsche Forschungsgemeinschaft (DFG) through TUM International Graduate School of Science and Engineering (IGSSE), GSC 81. An Tang was supported by the Fonds de recherche du Québec en Santé and Fondation de l’association des radiologistes du Québec (FRQS-ARQ 34939 Clinical Research Scholarship – Junior 2 Salary Award). Hongwei Bran Li is supported by Forschungskredit (Grant No. FK-21-125) from University of Zurich.
    Peer Reviewed
    Article signed by 109 authors: Patrick Bilic 1,a,b, Patrick Christ 1,a,b, Hongwei Bran Li 1,2,∗,b, Eugene Vorontsov 3,a,b, Avi Ben-Cohen 5,a, Georgios Kaissis 10,12,15,a, Adi Szeskin 18,a, Colin Jacobs 4,a, Gabriel Efrain Humpire Mamani 4,a, Gabriel Chartrand 26,a, Fabian Lohöfer 12,a, Julian Walter Holch 29,30,69,a, Wieland Sommer 32,a, Felix Hofmann 31,32,a, Alexandre Hostettler 36,a, Naama Lev-Cohain 38,a, Michal Drozdzal 34,a, Michal Marianne Amitai 35,a, Refael Vivanti 37,a, Jacob Sosna 38,a, Ivan Ezhov 1, Anjany Sekuboyina 1,2, Fernando Navarro 1,76,78, Florian Kofler 1,13,57,78, Johannes C. Paetzold 15,16, Suprosanna Shit 1, Xiaobin Hu 1, Jana Lipková 17, Markus Rempfler 1, Marie Piraud 57,1, Jan Kirschke 13, Benedikt Wiestler 13, Zhiheng Zhang 14, Christian Hülsemeyer 1, Marcel Beetz 1, Florian Ettlinger 1, Michela Antonelli 9, Woong Bae 73, Míriam Bellver 43, Lei Bi 61, Hao Chen 39, Grzegorz Chlebus 62,64, Erik B.
Dam 72, Qi Dou 41, Chi-Wing Fu 41, Bogdan Georgescu 60, Xavier Giró-i-Nieto 45, Felix Gruen 28, Xu Han 77, Pheng-Ann Heng 41, Jürgen Hesser 48,49,50, Jan Hendrik Moltz 62, Christian Igel 72, Fabian Isensee 69,70, Paul Jäger 69,70, Fucang Jia 75, Krishna Chaitanya Kaluva 21, Mahendra Khened 21, Ildoo Kim 73, Jae-Hun Kim 53, Sungwoong Kim 73, Simon Kohl 69, Tomasz Konopczynski 49, Avinash Kori 21, Ganapathy Krishnamurthi 21, Fan Li 22, Hongchao Li 11, Junbo Li 8, Xiaomeng Li 40, John Lowengrub 66,67,68, Jun Ma 54, Klaus Maier-Hein 69,70,7, Kevis-Kokitsi Maninis 44, Hans Meine 62,65, Dorit Merhof 74, Akshay Pai 72, Mathias Perslev 72, Jens Petersen 69, Jordi Pont-Tuset 44, Jin Qi 56, Xiaojuan Qi 40, Oliver Rippel 74, Karsten Roth 47, Ignacio Sarasua 51,12, Andrea Schenk 62,63, Zengming Shen 59,60, Jordi Torres 46,43, Christian Wachinger 51,12,1, Chunliang Wang 42, Leon Weninger 74, Jianrong Wu 25, Daguang Xu 71, Xiaoping Yang 55, Simon Chun-Ho Yu 58, Yading Yuan 52, Miao Yue 20, Liping Zhang 58, Jorge Cardoso 9, Spyridon Bakas 19,23,24, Rickmer Braren 6,12,30,a, Volker Heinemann 33,a, Christopher Pal 3,a, An Tang 27,a, Samuel Kadoury 3,a, Luc Soler 36,a, Bram van Ginneken 4,a, Hayit Greenspan 5,a, Leo Joskowicz 18,a, Bjoern Menze 1,2,a // 1 Department of Informatics, Technical University of Munich, Germany; 2 Department of Quantitative Biomedicine, University of Zurich, Switzerland; 3 Ecole Polytechnique de Montréal, Canada; 4 Department of Medical Imaging, Radboud University Medical Center, Nijmegen, The Netherlands; 5 Department of Biomedical Engineering, Tel-Aviv University, Israel; 6 German Cancer Consortium (DKTK), Germany; 7 Pattern Analysis and Learning Group, Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany; 8 Philips Research China, Philips China Innovation Campus, Shanghai, China; 9 School of Biomedical Engineering & Imaging Sciences, King’s College London, London, UK; 10 Institute for AI in Medicine, Technical University 
of Munich, Germany; 11 Department of Computer Science, Guangdong University of Foreign Studies, China; 12 Institute for diagnostic and interventional radiology, Klinikum rechts der Isar, Technical University of Munich, Germany; 13 Institute for diagnostic and interventional neuroradiology, Klinikum rechts der Isar,Technical University of Munich, Germany; 14 Department of Hepatobiliary Surgery, the Affiliated Drum Tower Hospital of Nanjing University Medical School, China; 15 Department of Computing, Imperial College London, London, United Kingdom; 16 Institute for Tissue Engineering and Regenerative Medicine, Helmholtz Zentrum München, Neuherberg, Germany; 17 Brigham and Women’s Hospital, Harvard Medical School, USA; 18 School of Computer Science and Engineering, the Hebrew University of Jerusalem, Israel; 19 Center for Biomedical Image Computing and Analytics (CBICA), University of Pennsylvania, PA, USA; 20 CGG Services (Singapore) Pte. Ltd., Singapore; 21 Medical Imaging and Reconstruction Lab, Department of Engineering Design, Indian Institute of Technology Madras, India; 22 Sensetime, Shanghai, China; 23 Department of Radiology, Perelman School of Medicine, University of Pennsylvania, USA; 24 Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, PA, USA; 25 Tencent Healthcare (Shenzhen) Co., Ltd, China; 26 The University of Montréal Hospital Research Centre (CRCHUM) Montréal, Québec, Canada; 27 Department of Radiology, Radiation Oncology and Nuclear Medicine, University of Montréal, Canada; 28 Institute of Control Engineering, Technische Universität Braunschweig, Germany; 29 Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; 30 Comprehensive Cancer Center Munich, Munich, Germany; 31 Department of General, Visceral and Transplantation Surgery, University Hospital, LMU Munich, Germany; 32 Department of Radiology, University Hospital, LMU Munich, Germany; 33 Department of 
Hematology/Oncology & Comprehensive Cancer Center Munich, LMU Klinikum Munich, Germany; 34 Polytechnique Montréal, Mila, QC, Canada; 35 Department of Diagnostic Radiology, Sheba Medical Center, Tel Aviv university, Israel; 36 Department of Surgical Data Science, Institut de Recherche contre les Cancers de l’Appareil Digestif (IRCAD), France; 37 Rafael Advanced Defense System, Israel; 38 Department of Radiology, Hadassah University Medical Center, Jerusalem, Israel; 39 Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, China; 40 Department of Electrical and Electronic Engineering, The University of Hong Kong, China; 41 Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China; 42 Department of Biomedical Engineering and Health Systems, KTH Royal Institute of Technology, Sweden; 43 Barcelona Supercomputing Center, Barcelona, Spain; 44 Eidgenössische Technische Hochschule Zurich (ETHZ), Zurich, Switzerland; 45 Signal Theory and Communications Department, Universitat Politecnica de Catalunya, Catalonia, Spain; 46 Universitat Politecnica de Catalunya, Catalonia, Spain; 47 University of Tuebingen, Germany; 48 Mannheim Institute for Intelligent Systems in Medicine, department of Medicine Mannheim, Heidelberg University, Germany; 49 Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Germany; 50 Central Institute for Computer Engineering (ZITI), Heidelberg University, Germany; 51 Department of Child and Adolescent Psychiatry, Ludwig-Maximilians-Universität, Munich, Germany; 52 Department of Radiation Oncology, Icahn School of Medicine at Mount Sinai, NY, USA; 53 Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, South Korea; 54 Department of Mathematics, Nanjing University of Science and Technology, China; 55 Department of Mathematics, Nanjing University, China; 56 School of Information and Communication 
Engineering, University of Electronic Science and Technology of China, China; 57 Helmholtz AI, Helmholtz Zentrum München, Neuherberg, Germany; 58 Department of Imaging and Interventional Radiology, Chinese University of Hong Kong, Hong Kong, China; 59 Beckman Institute, University of Illinois at Urbana-Champaign, USA; 60 Siemens Healthineers, USA; 61 School of Computer Science, the University of Sydney, Australia; 62 Fraunhofer MEVIS, Bremen, Germany; 63 Institute for Diagnostic and Interventional Radiology, Hannover Medical School, Hannover, Germany; 64 Diagnostic Image Analysis Group, Radboud University Medical Center, Nijmegen, The Netherlands; 65 Medical Image Computing Group, FB3, University of Bremen, Germany; 66 Departments of Mathematics, Biomedical Engineering, University of California, Irvine, USA; 67 Center for Complex Biological Systems, University of California, Irvine, USA; 68 Chao Family Comprehensive Cancer Center, University of California, Irvine, USA; 69 Division of Medical Image Computing, German Cancer Research Center (DKFZ), Heidelberg, Germany; 70 Helmholtz Imaging, Germany; 71 NVIDIA, Santa Clara, CA, USA; 72 Department of Computer Science, University of Copenhagen, Denmark; 73 Kakao Brain, Republic of Korea; 74 Institute of Imaging & Computer Vision, RWTH Aachen University, Germany; 75 Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, China; 76 Department of Radiation Oncology and Radiotherapy, Klinikum rechts der Isar, Technical University of Munich, Germany; 77 Department of Computer Science, UNC Chapel Hill, USA; 78 TranslaTUM - Central Institute for Translational Cancer Research, Technical University of Munich, Germany
    Postprint (published version)
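For readers unfamiliar with the Dice score reported in the LiTS abstract, the standard definition is 2|A∩B| / (|A| + |B|) for a predicted mask A and a reference mask B. The sketch below computes it on flattened binary masks. It is only an illustration of the general formula: the function name `dice_score` is hypothetical, the convention of returning 1.0 when both masks are empty is an assumption, and the benchmark's exact evaluation protocol is not described in the abstract.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice overlap between two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")

    # Count voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of the foreground sizes of each mask.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


pred = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground voxels.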